Improved language recognition using mixture components statistics
Authors
Abstract
One successful approach to language recognition is to focus on the most discriminative high-level features of languages, such as phones and words. In this paper, we apply a similar approach to acoustic features, using a single GMM tokenizer followed by discriminatively trained language models. A feature selection technique based on the Support Vector Machine (SVM) is used to model higher-order n-grams. Three different ways of building this tokenizer are explored and compared using a discriminative uni-gram and a generative GMM-UBM. A discriminative uni-gram using a very large GMM tokenizer with 24,576 components yields an EER of 1.66%, improving to 0.71% when fused with other acoustic approaches, on the NIST'03 LRE 30s evaluation.
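The pipeline described above (acoustic frames tokenized by a GMM, then n-gram statistics fed to a discriminative classifier) can be sketched on toy data. This is a minimal illustration, not the paper's implementation: the component count, features, and data are all placeholders, and scikit-learn stands in for whatever toolkit the authors used.

```python
# Hedged sketch of a GMM tokenizer + SVM over n-gram counts.
# All data and hyperparameters here are illustrative toys.
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)

# Toy "acoustic" frames for two languages (stand-ins for real features).
frames_lang_a = rng.normal(0.0, 1.0, size=(400, 13))
frames_lang_b = rng.normal(0.8, 1.0, size=(400, 13))

# 1) Train a small GMM tokenizer on pooled frames (the paper scales this
#    to 24,576 components; 8 keeps the toy example fast).
gmm = GaussianMixture(n_components=8, covariance_type="diag", random_state=0)
gmm.fit(np.vstack([frames_lang_a, frames_lang_b]))

def tokenize(frames):
    """Map each frame to the index of its most likely mixture component."""
    return " ".join(str(c) for c in gmm.predict(frames))

# 2) Split each language into 100-frame "utterances" and tokenize them.
docs = [tokenize(frames_lang_a[i * 100:(i + 1) * 100]) for i in range(4)] \
     + [tokenize(frames_lang_b[i * 100:(i + 1) * 100]) for i in range(4)]
labels = [0] * 4 + [1] * 4

# 3) Count uni-grams and bi-grams of the token stream
#    (token_pattern is relaxed so single-digit tokens are kept).
vec = CountVectorizer(ngram_range=(1, 2), token_pattern=r"\b\w+\b")
X = vec.fit_transform(docs)

# 4) Discriminatively trained language model: a linear SVM on n-gram counts.
clf = LinearSVC(C=1.0).fit(X, labels)
print(clf.score(X, labels))
```

On these well-separated toy classes the linear SVM fits the training set easily; the point is only to show the shape of the tokenize-then-count-then-classify pipeline.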
Similar resources
Speaker verification and spoken language identification using a generalized i-vector framework with phonetic tokenizations and tandem features
This paper presents a generalized i-vector framework with phonetic tokenizations and tandem features for speaker verification as well as language identification. First, the tokens for calculating the zero-order statistics are extended from the MFCC-trained Gaussian Mixture Model (GMM) components to phonetic phonemes, 3-grams and tandem-feature-trained GMM components using phoneme posterior prob...
Acoustic feature transformation using UBM-based LDA for speaker recognition
In a state-of-the-art speaker recognition system, the universal background model (UBM) plays the role of dividing the acoustic space. Each Gaussian mixture component of the trained UBM represents one distinct acoustic region. The posterior probabilities of features belonging to each region are further used as core components of the Baum-Welch statistics. Therefore, the quality of the estimated Baum-Welch statistics depends high...
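The Baum-Welch statistics mentioned in this blurb can be sketched numerically. This is an assumed, generic formulation of the zero- and first-order statistics from UBM posteriors, with random stand-ins for the per-component likelihoods; it is not taken from the paper itself.

```python
# Minimal sketch of zero- and first-order Baum-Welch statistics
# computed from per-frame UBM component posteriors (responsibilities).
import numpy as np

rng = np.random.default_rng(1)
T, D, C = 50, 4, 3                    # frames, feature dim, UBM components

frames = rng.normal(size=(T, D))      # toy acoustic features
log_lik = rng.normal(size=(T, C))     # stand-in per-component log-likelihoods

# Posterior probability of each component for each frame (softmax over
# components, shifted for numerical stability).
post = np.exp(log_lik - log_lik.max(axis=1, keepdims=True))
post /= post.sum(axis=1, keepdims=True)

# Zero-order statistics: expected occupancy count of each component.
N = post.sum(axis=0)                  # shape (C,)

# First-order statistics: posterior-weighted sums of the frames.
F = post.T @ frames                   # shape (C, D)

print(N.sum())                        # occupancies sum to the frame count
```

The occupancy counts summing to the number of frames is a useful sanity check when wiring these statistics into an i-vector or JFA front-end.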
Discriminative acoustic language recognition via channel-compensated GMM statistics
We propose a novel design for acoustic feature-based automatic spoken language recognizers. Our design is inspired by recent advances in text-independent speaker recognition, where intraclass variability is modeled by factor analysis in Gaussian mixture model (GMM) space. We use approximations to GMM likelihoods which allow variable-length data sequences to be represented as statistics of fixed ...
Face recognition using mixtures of principal components
We introduce an efficient statistical modeling technique called Mixture of Principal Components (MPC). This model is a linear extension to the traditional Principal Component Analysis (PCA) and uses a mixture of eigenspaces to capture data variations. We use the model to capture face appearance variations due to pose and lighting changes. We show that this more efficient modeling leads to impro...
Gibbs Sampling for (Coupled) Infinite Mixture Models in the Stick Breaking Representation
Nonparametric Bayesian approaches to clustering, information retrieval, language modeling and object recognition have recently shown great promise as a new paradigm for unsupervised data analysis. Most contributions have focused on the Dirichlet process mixture models or extensions thereof for which efficient Gibbs samplers exist. In this paper we explore Gibbs samplers for infinite complexity ...
Publication date: 2010